Hyperparameter Optimization: A Spectral Approach
Abstract
We give a simple, fast algorithm for hyperparameter optimization inspired by techniques from the analysis of Boolean functions. We focus on the high-dimensional regime, where the canonical example is training a neural network with a large number of hyperparameters. The algorithm, an iterative application of compressed sensing techniques for orthogonal polynomials, requires only uniform sampling of the hyperparameters and is thus easily parallelizable. Experiments for training deep nets on CIFAR-10 show that compared to state-of-the-art tools (e.g., Hyperband and Spearmint), our algorithm finds significantly improved solutions, in some cases matching what is attainable by hand-tuning. In terms of overall running time (i.e., time required to sample various settings of hyperparameters plus additional computation time), we are at least an order of magnitude faster than Hyperband and even more so compared to Bayesian Optimization. We also outperform Random Search by 5×. Additionally, our method comes with provable guarantees and yields the first quasipolynomial time algorithm for learning decision trees under the uniform distribution with polynomial sample complexity, the first improvement in over two decades.
arXiv:1706.00764v2 [cs.LG] 7 Jun 2017
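The abstract describes the method only at a high level. The sketch below illustrates one round of the idea under stated assumptions: hyperparameters encoded as binary variables in {-1, +1}, uniform sampling, and a Lasso fit over the low-degree parity (Fourier) basis as the compressed-sensing step. All names here (`run_trial`, `spectral_round`, the parameter defaults) are hypothetical illustrations, not the paper's reference implementation, and the full method additionally fixes the most influential variables and recurses on the rest, which this sketch omits.

```python
# Minimal sketch of one round of the spectral approach, assuming +/-1-encoded
# hyperparameters and Lasso as the sparse-recovery step. Hypothetical names;
# not the authors' implementation.
import itertools
import numpy as np
from sklearn.linear_model import Lasso

def parity_features(X, degree):
    """Expand +/-1 assignments into all parity (monomial) features up to `degree`."""
    n = X.shape[1]
    subsets = [s for d in range(1, degree + 1)
               for s in itertools.combinations(range(n), d)]
    feats = np.column_stack([X[:, list(s)].prod(axis=1) for s in subsets])
    return feats, subsets

def spectral_round(run_trial, n_vars, n_samples=100, degree=2, alpha=0.05):
    """Sample hyperparameters uniformly, evaluate them, and fit a sparse
    low-degree polynomial to identify the most influential variables."""
    X = np.random.choice([-1.0, 1.0], size=(n_samples, n_vars))  # uniform sampling
    y = np.array([run_trial(x) for x in X])                      # e.g. validation loss
    feats, subsets = parity_features(X, degree)
    model = Lasso(alpha=alpha).fit(feats, y)
    # Rank parities by coefficient magnitude; the variables they involve are
    # candidates to fix before recursing on the remaining hyperparameters.
    order = np.argsort(-np.abs(model.coef_))
    return [(subsets[i], model.coef_[i]) for i in order if model.coef_[i] != 0]
```

Because each trial is evaluated independently on uniformly drawn settings, the sampling loop parallelizes trivially, which is the property the abstract highlights.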
Similar resources
Open Loop Hyperparameter Optimization and Determinantal Point Processes
We propose the use of k-determinantal point processes in hyperparameter optimization via random search. Compared to conventional approaches where hyperparameter settings are sampled independently, a k-DPP promotes diversity. We describe an approach that transforms hyperparameter search spaces for efficient use with a k-DPP. Our experiments show significant benefits over uniform random search in...
Hyperparameter Optimization and Boosting for Classifying Facial Expressions: How good can a "Null" Model be?
One of the goals of the ICML workshop on representation and learning is to establish benchmark scores for a new data set of labeled facial expressions. This paper presents the performance of a “Null model” consisting of convolutions with random weights, PCA, pooling, normalization, and a linear readout. Our approach focused on hyperparameter optimization rather than novel model components. On t...
Combination of Hyperband and Bayesian Optimization for Hyperparameter Optimization in Deep Learning
Deep learning has achieved impressive results on many problems. However, it requires a high degree of expertise or a lot of experience to tune the hyperparameters well, and such a manual tuning process is likely to be biased. Moreover, it is not practical to try out as many different hyperparameter configurations in deep learning as in other machine learning scenarios, because evaluating each singl...
Initializing Bayesian Hyperparameter Optimization via Meta-Learning
Model selection and hyperparameter optimization are crucial in applying machine learning to a novel dataset. Recently, a subcommunity of machine learning has focused on solving this problem with Sequential Model-based Bayesian Optimization (SMBO), demonstrating substantial successes in many applications. However, for computationally expensive algorithms the overhead of hyperparameter optimizatio...
Continuous Regularization Hyperparameters
Hyperparameter selection generally relies on running multiple full training trials, with hyperparameter selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model. Hyperparameters are adjusted so as to make the model parameter gradients, and hence updates, more advantageous for the validation cost. We ex...
Journal: CoRR
Volume: abs/1706.00764
Pages: -
Publication year: 2017